Exploring Geometric Property Thresholds For Filtering Non-Text Regions In A Connected Component Based Text Detection Application

نویسنده

  • Teresa Nicole Brooks
چکیده

Automated text detection is a difficult computer vision task. In order to accurately detect and identity text in an image or video, two major problems must be addressed. The primary problem is implementing a robust and reliable method for distinguishing text vs non-text regions in images and videos. Part of the difficulty stems from the almost unlimited combinations of fonts, lighting conditions, distortions, and other variations that can be found in images and videos. This paper explores key properties of two popular and proven methods for implementing text detection; maximum stable external regions (MSER) and stroke width variation.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Natural scene text localization using edge color signature

Localizing text regions in images taken from natural scenes is one of the challenging problems dueto variations in font, size, color and orientation of text. In this paper, we introduce a new concept socalled Edge Color Signature for localizing text regions in an image. This method is able to localizeboth Farsi and English texts. In the proposed method rst a pyramid using diff...

متن کامل

Directional Stroke Width Transform to Separate Text and Graphics in City Maps

One of the complex documents in the real world is city maps. In these kinds of maps, text labels overlap by graphics with having a variety of fonts and styles in different orientations. Usually, text and graphic colour is not predefined due to various map publishers. In most city maps, text and graphic lines form a single connected component. Moreover, the common regions of text and graphic lin...

متن کامل

Detection of Text with Connected Component Clustering

Text detection and recognition is a hot topic for researchers in the field of image processing. It gives attention to Content based Image Retrieval (CBIR) community in order to fill the semantic gap between low level and high level features. Several methods have been developed for text detection and extraction that achieve reasonable accuracy for natural scene text (camera images) as well as mu...

متن کامل

Skew detection for complex document images using robust borderlines in both text and non-text regions

0167-8655/$ see front matter 2008 Elsevier B.V. A doi:10.1016/j.patrec.2008.06.008 * Corresponding author. Address: National Lab on University, Beijing 100871, China. Fax: +86 10 62755 E-mail address: [email protected] (H. Liu). A new skew detection method for complex document images based on robust borderlines extracted from both text and non-text regions is proposed in this paper. First, bor...

متن کامل

Connected Component Based Word Spotting on Persian Handwritten image documents

Word spotting is to make searchable unindexed image documents by locating word/words in a doc-ument image, given a query word. This problem is challenging, mainly due to the large numberof word classes with very small inter-class and substantial intra-class distances. In this paper, asegmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1709.03548  شماره 

صفحات  -

تاریخ انتشار 2017